The search functionality is under construction.

Keyword Search Result

[Keyword] neural network(855hit)

181-200hit(855hit)

  • Multi-Targeted Backdoor: Indentifying Backdoor Attack for Multiple Deep Neural Networks

    Hyun KWON  Hyunsoo YOON  Ki-Woong PARK  

     
    LETTER-Information Network

      Pubricized:
    2020/01/15
      Vol:
    E103-D No:4
      Page(s):
    883-887

    We propose a multi-targeted backdoor that misleads different models to different classes. The method trains multiple models with data that include specific triggers that will be misclassified by different models into different classes. For example, an attacker can use a single multi-targeted backdoor sample to make model A recognize it as a stop sign, model B as a left-turn sign, model C as a right-turn sign, and model D as a U-turn sign. We used MNIST and Fashion-MNIST as experimental datasets and Tensorflow as a machine learning library. Experimental results show that the proposed method with a trigger can cause misclassification as different classes by different models with a 100% attack success rate on MNIST and Fashion-MNIST while maintaining the 97.18% and 91.1% accuracy, respectively, on data without a trigger.

  • The Role of Accent and Grouping Structures in Estimating Musical Meter

    Han-Ying LIN  Chien-Chieh HUANG  Wen-Whei CHANG  Jen-Tzung CHIEN  

     
    PAPER-Engineering Acoustics

      Vol:
    E103-A No:4
      Page(s):
    649-656

    This study presents a new method to exploit both accent and grouping structures of music in meter estimation. The system starts by extracting autocorrelation-based features that characterize accent periodicities. Based on the local boundary detection model, we construct grouping features that serve as additional cues for inferring meter. After the feature extraction, a multi-layer cascaded classifier based on neural network is incorporated to derive the most likely meter of input melody. Experiments on 7351 folk melodies in MIDI files indicate that the proposed system achieves an accuracy of 95.76% for classification into nine categories of meters.

  • Software Development Effort Estimation from Unstructured Software Project Description by Sequence Models

    Tachanun KANGWANTRAKOOL  Kobkrit VIRIYAYUDHAKORN  Thanaruk THEERAMUNKONG  

     
    PAPER

      Pubricized:
    2020/01/14
      Vol:
    E103-D No:4
      Page(s):
    739-747

    Most existing methods of effort estimations in software development are manual, labor-intensive and subjective, resulting in overestimation with bidding fail, and underestimation with money loss. This paper investigates effectiveness of sequence models on estimating development effort, in the form of man-months, from software project data. Four architectures; (1) Average word-vector with Multi-layer Perceptron (MLP), (2) Average word-vector with Support Vector Regression (SVR), (3) Gated Recurrent Unit (GRU) sequence model, and (4) Long short-term memory (LSTM) sequence model are compared in terms of man-months difference. The approach is evaluated using two datasets; ISEM (1,573 English software project descriptions) and ISBSG (9,100 software projects data), where the former is a raw text and the latter is a structured data table explained the characteristic of a software project. The LSTM sequence model achieves the lowest and the second lowest mean absolute errors, which are 0.705 and 14.077 man-months for ISEM and ISBSG datasets respectively. The MLP model achieves the lowest mean absolute errors which is 14.069 for ISBSG datasets.

  • The Effect of Axis-Wise Triaxial Acceleration Data Fusion in CNN-Based Human Activity Recognition

    Xinxin HAN  Jian YE  Jia LUO  Haiying ZHOU  

     
    PAPER-Artificial Intelligence, Data Mining

      Pubricized:
    2020/01/14
      Vol:
    E103-D No:4
      Page(s):
    813-824

    The triaxial accelerometer is one of the most important sensors for human activity recognition (HAR). It has been observed that the relations between the axes of a triaxial accelerometer plays a significant role in improving the accuracy of activity recognition. However, the existing research rarely focuses on these relations, but rather on the fusion of multiple sensors. In this paper, we propose a data fusion-based convolutional neural network (CNN) approach to effectively use the relations between the axes. We design a single-channel data fusion method and multichannel data fusion method in consideration of the diversified formats of sensor data. After obtaining the fused data, a CNN is used to extract the features and perform classification. The experiments show that the proposed approach has an advantage over the CNN in accuracy. Moreover, the single-channel model achieves an accuracy of 98.83% with the WISDM dataset, which is higher than that of state-of-the-art methods.

  • Real-Time Generic Object Tracking via Recurrent Regression Network

    Rui CHEN  Ying TONG  Ruiyu LIANG  

     
    PAPER-Artificial Intelligence, Data Mining

      Pubricized:
    2019/12/20
      Vol:
    E103-D No:3
      Page(s):
    602-611

    Deep neural networks have achieved great success in visual tracking by learning a generic representation and leveraging large amounts of training data to improve performance. Most generic object trackers are trained from scratch online and do not benefit from a large number of videos available for offline training. We present a real-time generic object tracker capable of incorporating temporal information into its model, learning from many examples offline and quickly updating online. During the training process, the pre-trained weight of convolution layer is updated lagging behind, and the input video sequence length is gradually increased for fast convergence. Furthermore, only the hidden states in recurrent network are updated to guarantee the real-time tracking speed. The experimental results show that the proposed tracking method is capable of tracking objects at 150 fps with higher predicting overlap rate, and achieves more robustness in multiple benchmarks than state-of-the-art performance.

  • Combining CNN and Broad Learning for Music Classification

    Huan TANG  Ning CHEN  

     
    PAPER-Music Information Processing

      Pubricized:
    2019/12/05
      Vol:
    E103-D No:3
      Page(s):
    695-701

    Music classification has been inspired by the remarkable success of deep learning. To enhance efficiency and ensure high performance at the same time, a hybrid architecture that combines deep learning and Broad Learning (BL) is proposed for music classification tasks. At the feature extraction stage, the Random CNN (RCNN) is adopted to analyze the Mel-spectrogram of the input music sound. Compared with conventional CNN, RCNN has more flexible structure to adapt to the variance contained in different types of music. At the prediction stage, the BL technique is introduced to enhance the prediction accuracy and reduce the training time as well. Experimental results on three benchmark datasets (GTZAN, Ballroom, and Emotion) demonstrate that: i) The proposed scheme achieves higher classification accuracy than the deep learning based one, which combines CNN and LSTM, on all three benchmark datasets. ii) Both RCNN and BL contribute to the performance improvement of the proposed scheme. iii) The introduction of BL also helps to enhance the prediction efficiency of the proposed scheme.

  • Broadband Direction of Arrival Estimation Based on Convolutional Neural Network Open Access

    Wenli ZHU  Min ZHANG  Chenxi WU  Lingqing ZENG  

     
    PAPER-Fundamental Theories for Communications

      Pubricized:
    2019/08/27
      Vol:
    E103-B No:3
      Page(s):
    148-154

    A convolutional neural network (CNN) for broadband direction of arrival (DOA) estimation of far-field electromagnetic signals is presented. The proposed algorithm performs a nonlinear inverse mapping from received signal to angle of arrival. The signal model used for algorithm is based on the circular antenna array geometry, and the phase component extracted from the spatial covariance matrix is used as the input of the CNN network. A CNN model including three convolutional layers is then established to approximate the nonlinear mapping. The performance of the CNN model is evaluated in a noisy environment for various values of signal-to-noise ratio (SNR). The results demonstrate that the proposed CNN model with the phase component of the spatial covariance matrix as the input is able to achieve fast and accurate broadband DOA estimation and attains perfect performance at lower SNR values.

  • Fast Inference of Binarized Convolutional Neural Networks Exploiting Max Pooling with Modified Block Structure

    Ji-Hoon SHIN  Tae-Hwan KIM  

     
    LETTER-Software System

      Pubricized:
    2019/12/03
      Vol:
    E103-D No:3
      Page(s):
    706-710

    This letter presents a novel technique to achieve a fast inference of the binarized convolutional neural networks (BCNN). The proposed technique modifies the structure of the constituent blocks of the BCNN model so that the input elements for the max-pooling operation are binary. In this structure, if any of the input elements is +1, the result of the pooling can be produced immediately; the proposed technique eliminates such computations that are involved to obtain the remaining input elements, so as to reduce the inference time effectively. The proposed technique reduces the inference time by up to 34.11%, while maintaining the classification accuracy.

  • Edge-SiamNet and Edge-TripleNet: New Deep Learning Models for Handwritten Numeral Recognition

    Weiwei JIANG  Le ZHANG  

     
    LETTER-Image Recognition, Computer Vision

      Pubricized:
    2019/12/09
      Vol:
    E103-D No:3
      Page(s):
    720-723

    Handwritten numeral recognition is a classical and important task in the computer vision area. We propose two novel deep learning models for this task, which combine the edge extraction method and Siamese/Triple network structures. We evaluate the models on seven handwritten numeral datasets and the results demonstrate both the simplicity and effectiveness of our models, comparing to baseline methods.

  • Knowledge Discovery from Layered Neural Networks Based on Non-negative Task Matrix Decomposition

    Chihiro WATANABE  Kaoru HIRAMATSU  Kunio KASHINO  

     
    PAPER-Artificial Intelligence, Data Mining

      Pubricized:
    2019/10/23
      Vol:
    E103-D No:2
      Page(s):
    390-397

    Interpretability has become an important issue in the machine learning field, along with the success of layered neural networks in various practical tasks. Since a trained layered neural network consists of a complex nonlinear relationship between large number of parameters, we failed to understand how they could achieve input-output mappings with a given data set. In this paper, we propose the non-negative task matrix decomposition method, which applies non-negative matrix factorization to a trained layered neural network. This enables us to decompose the inference mechanism of a trained layered neural network into multiple principal tasks of input-output mapping, and reveal the roles of hidden units in terms of their contribution to each principal task.

  • On Performance of Deep Learning for Harmonic Spur Cancellation in OFDM Systems

    Ziming HE  

     
    LETTER-Mobile Information Network and Personal Communications

      Vol:
    E103-A No:2
      Page(s):
    576-579

    In this letter, the performance of a state-of-the-art deep learning (DL) algorithm in [5] is analyzed and evaluated for orthogonal frequency-division multiplexing (OFDM) receivers, in the presence of harmonic spur interference. Moreover, a novel spur cancellation receiver structure and algorithm are proposed to enhance the traditional OFDM receivers, and serve as a performance benchmark for the DL algorithm. It is found that the DL algorithm outperforms the traditional algorithm and is much more robust to spur carrier frequency offset.

  • A New GAN-Based Anomaly Detection (GBAD) Approach for Multi-Threat Object Classification on Large-Scale X-Ray Security Images

    Joanna Kazzandra DUMAGPI  Woo-Young JUNG  Yong-Jin JEONG  

     
    LETTER-Artificial Intelligence, Data Mining

      Pubricized:
    2019/10/23
      Vol:
    E103-D No:2
      Page(s):
    454-458

    Threat object recognition in x-ray security images is one of the important practical applications of computer vision. However, research in this field has been limited by the lack of available dataset that would mirror the practical setting for such applications. In this paper, we present a novel GAN-based anomaly detection (GBAD) approach as a solution to the extreme class-imbalance problem in multi-label classification. This method helps in suppressing the surge in false positives induced by training a CNN on a non-practical dataset. We evaluate our method on a large-scale x-ray image database to closely emulate practical scenarios in port security inspection systems. Experiments demonstrate improvement against the existing algorithm.

  • Cross-Corpus Speech Emotion Recognition Based on Deep Domain-Adaptive Convolutional Neural Network

    Jiateng LIU  Wenming ZHENG  Yuan ZONG  Cheng LU  Chuangao TANG  

     
    LETTER-Pattern Recognition

      Pubricized:
    2019/11/07
      Vol:
    E103-D No:2
      Page(s):
    459-463

    In this letter, we propose a novel deep domain-adaptive convolutional neural network (DDACNN) model to handle the challenging cross-corpus speech emotion recognition (SER) problem. The framework of the DDACNN model consists of two components: a feature extraction model based on a deep convolutional neural network (DCNN) and a domain-adaptive (DA) layer added in the DCNN utilizing the maximum mean discrepancy (MMD) criterion. We use labeled spectrograms from source speech corpus combined with unlabeled spectrograms from target speech corpus as the input of two classic DCNNs to extract the emotional features of speech, and train the model with a special mixed loss combined with a cross-entrophy loss and an MMD loss. Compared to other classic cross-corpus SER methods, the major advantage of the DDACNN model is that it can extract robust speech features which are time-frequency related by spectrograms and narrow the discrepancies between feature distribution of source corpus and target corpus to get better cross-corpus performance. Through several cross-corpus SER experiments, our DDACNN achieved the state-of-the-art performance on three public emotion speech corpora and is proved to handle the cross-corpus SER problem efficiently.

  • Recurrent Neural Network Compression Based on Low-Rank Tensor Representation

    Andros TJANDRA  Sakriani SAKTI  Satoshi NAKAMURA  

     
    PAPER-Music Information Processing

      Pubricized:
    2019/10/17
      Vol:
    E103-D No:2
      Page(s):
    435-449

    Recurrent Neural Network (RNN) has achieved many state-of-the-art performances on various complex tasks related to the temporal and sequential data. But most of these RNNs require much computational power and a huge number of parameters for both training and inference stage. Several tensor decomposition methods are included such as CANDECOMP/PARAFAC (CP), Tucker decomposition and Tensor Train (TT) to re-parameterize the Gated Recurrent Unit (GRU) RNN. First, we evaluate all tensor-based RNNs performance on sequence modeling tasks with a various number of parameters. Based on our experiment results, TT-GRU achieved the best results in a various number of parameters compared to other decomposition methods. Later, we evaluate our proposed TT-GRU with speech recognition task. We compressed the bidirectional GRU layers inside DeepSpeech2 architecture. Based on our experiment result, our proposed TT-format GRU are able to preserve the performance while reducing the number of GRU parameters significantly compared to the uncompressed GRU.

  • Blind Detection Algorithm Based on Spectrum Sharing and Coexistence for Machine-to-Machine Communication

    Yun ZHANG  Bingrui LI  Shujuan YU  Meisheng ZHAO  

     
    PAPER-Analog Signal Processing

      Vol:
    E103-A No:1
      Page(s):
    297-302

    In this paper, we propose a new scheme which uses blind detection algorithm for recovering the conventional user signal in a system which the sporadic machine-to-machine (M2M) communication share the same spectrum with the conventional user. Compressive sensing techniques are used to estimate the M2M devices signals. Based on the Hopfield neural network (HNN), the blind detection algorithm is used to recover the conventional user signal. The simulation results show that the conventional user signal can be effectively restored under an unknown channel. Compared with the existing methods, such as using the training sequence to estimate the channel in advance, the blind detection algorithm used in this paper with no need for identifying the channel, and can directly detect the transmitted signal blindly.

  • Attribute-Aware Loss Function for Accurate Semantic Segmentation Considering the Pedestrian Orientations Open Access

    Mahmud Dwi SULISTIYO  Yasutomo KAWANISHI  Daisuke DEGUCHI  Ichiro IDE  Takatsugu HIRAYAMA  Jiang-Yu ZHENG  Hiroshi MURASE  

     
    PAPER

      Vol:
    E103-A No:1
      Page(s):
    231-242

    Numerous applications such as autonomous driving, satellite imagery sensing, and biomedical imaging use computer vision as an important tool for perception tasks. For Intelligent Transportation Systems (ITS), it is required to precisely recognize and locate scenes in sensor data. Semantic segmentation is one of computer vision methods intended to perform such tasks. However, the existing semantic segmentation tasks label each pixel with a single object's class. Recognizing object attributes, e.g., pedestrian orientation, will be more informative and help for a better scene understanding. Thus, we propose a method to perform semantic segmentation with pedestrian attribute recognition simultaneously. We introduce an attribute-aware loss function that can be applied to an arbitrary base model. Furthermore, a re-annotation to the existing Cityscapes dataset enriches the ground-truth labels by annotating the attributes of pedestrian orientation. We implement the proposed method and compare the experimental results with others. The attribute-aware semantic segmentation shows the ability to outperform baseline methods both in the traditional object segmentation task and the expanded attribute detection task.

  • Convolutional Neural Networks for Pilot-Induced Cyclostationarity Based OFDM Signals Spectrum Sensing in Full-Duplex Cognitive Radio

    Hang LIU  Xu ZHU  Takeo FUJII  

     
    PAPER-Terrestrial Wireless Communication/Broadcasting Technologies

      Pubricized:
    2019/07/16
      Vol:
    E103-B No:1
      Page(s):
    91-102

    The spectrum sensing of the orthogonal frequency division multiplexing (OFDM) system in cognitive radio (CR) has always been challenging, especially for user terminals that utilize the full-duplex (FD) mode. We herein propose an advanced FD spectrum-sensing scheme that can be successfully performed even when severe self-interference is encountered from the user terminal. Based on the “classification-converted sensing” framework, the cyclostationary periodogram generated by OFDM pilots is exhibited in the form of images. These images are subsequently plugged into convolutional neural networks (CNNs) for classifications owing to the CNN's strength in image recognition. More importantly, to realize spectrum sensing against residual self-interference, noise pollution, and channel fading, we used adversarial training, where a CR-specific, modified training database was proposed. We analyzed the performances exhibited by the different architectures of the CNN and the different resolutions of the input image to balance the detection performance with computing capability. We proposed a design plan of the signal structure for the CR transmitting terminal that can fit into the proposed spectrum-sensing scheme while benefiting from its own transmission. The simulation results prove that our method has excellent sensing capability for the FD system; furthermore, our method achieves a higher detection accuracy than the conventional method.

  • Neural Watermarking Method Including an Attack Simulator against Rotation and Compression Attacks

    Ippei HAMAMOTO  Masaki KAWAMURA  

     
    PAPER

      Pubricized:
    2019/10/23
      Vol:
    E103-D No:1
      Page(s):
    33-41

    We have developed a digital watermarking method that use neural networks to learn embedding and extraction processes that are robust against rotation and JPEG compression. The proposed neural networks consist of a stego-image generator, a watermark extractor, a stego-image discriminator, and an attack simulator. The attack simulator consists of a rotation layer and an additive noise layer, which simulate the rotation attack and the JPEG compression attack, respectively. The stego-image generator can learn embedding that is robust against these attacks, and also, the watermark extractor can extract watermarks without rotation synchronization. The quality of the stego-images can be improved by using the stego-image discriminator, which is a type of adversarial network. We evaluated the robustness of the watermarks and image quality and found that, using the proposed method, high-quality stego-images could be generated and the neural networks could be trained to embed and extract watermarks that are robust against rotation and JPEG compression attacks. We also showed that the robustness and image quality can be adjusted by changing the noise strength in the noise layer.

  • SDChannelNets: Extremely Small and Efficient Convolutional Neural Networks

    JianNan ZHANG  JiJun ZHOU  JianFeng WU  ShengYing YANG  

     
    LETTER-Biocybernetics, Neurocomputing

      Pubricized:
    2019/09/10
      Vol:
    E102-D No:12
      Page(s):
    2646-2650

    Convolutional neural networks (CNNS) have a strong ability to understand and judge images. However, the enormous parameters and computation of CNNS have limited its application in resource-limited devices. In this letter, we used the idea of parameter sharing and dense connection to compress the parameters in the convolution kernel channel direction, thus greatly reducing the number of model parameters. On this basis, we designed Shared and Dense Channel-wise Convolutional Networks (SDChannelNets), mainly composed of Depth-wise Separable SD-Channel-wise Convolution layer. The advantage of SDChannelNets is that the number of model parameters is greatly reduced without or with little loss of accuracy. We also introduced a hyperparameter that can effectively balance the number of parameters and the accuracy of a model. We evaluated the model proposed by us through two popular image recognition tasks (CIFAR-10 and CIFAR-100). The results showed that SDChannelNets had similar accuracy to other CNNs, but the number of parameters was greatly reduced.

  • Detecting Surface Defects of Wind Tubine Blades Using an Alexnet Deep Learning Algorithm Open Access

    Xiao-Yi ZHAO  Chao-Yi DONG  Peng ZHOU  Mei-Jia ZHU  Jing-Wen REN  Xiao-Yan CHEN  

     
    PAPER-Machine Learning

      Vol:
    E102-A No:12
      Page(s):
    1817-1824

    The paper employed an Alexnet, which is a deep learning framework, to automatically diagnose the damages of wind power generator blade surfaces. The original images of wind power generator blade surfaces were captured by machine visions of a 4-rotor UAV (unmanned aerial vehicle). Firstly, an 8-layer Alexnet, totally including 21 functional sub-layers, is constructed and parameterized. Secondly, the Alexnet was trained with 10000 images and then was tested by 6-turn 350 images. Finally, the statistic of network tests shows that the average accuracy of damage diagnosis by Alexnet is about 99.001%. We also trained and tested a traditional BP (Back Propagation) neural network, which have 20-neuron input layer, 5-neuron hidden layer, and 1-neuron output layer, with the same image data. The average accuracy of damage diagnosis of BP neural network is 19.424% lower than that of Alexnet. The point shows that it is feasible to apply the UAV image acquisition and the deep learning classifier to diagnose the damages of wind turbine blades in service automatically.

181-200hit(855hit)